Robust speech enhancement techniques for ASR in non-stationary noise and dynamic environments
Abstract
In current ASR systems, the presence of competing speakers greatly degrades recognition performance. This effect becomes even more prominent in hands-free, far-field ASR systems such as “Smart-TV” systems, where reverberation and non-stationary noise pose additional challenges. Furthermore, speakers most often do not stand still while speaking. To address these issues, we propose a cascaded system that includes Time Difference of Arrival (TDOA) estimation, multi-channel Wiener filtering, non-negative matrix factorization (NMF), multi-condition training, and robust feature extraction, each of which additively improves the overall performance. The final cascaded system achieves average relative improvements in ASR word accuracy of 50% and 45% on the CHiME 2011 (non-stationary noise) and CHiME 2012 (non-stationary noise plus speaker head movement) tasks, respectively.
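To illustrate the first stage of such a cascade, the following is a minimal sketch of time-difference-of-arrival estimation between two microphone channels. The abstract does not say which estimator the authors use; this sketch assumes the widely used GCC-PHAT method (generalized cross-correlation with phase transform). The function name `gcc_phat_tdoa` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def gcc_phat_tdoa(x, y, fs, max_tau=None):
    """Estimate the delay of y relative to x, in seconds, with GCC-PHAT.

    Assumption: GCC-PHAT is used here for illustration only; it is not
    necessarily the estimator used in the paper.
    A positive result means y lags behind x.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(x) + len(y)
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)

    # Cross-power spectrum; the PHAT weighting keeps only the phase.
    cross = Y * np.conj(X)
    cross /= np.abs(cross) + 1e-12

    # Back to the lag domain.
    cc = np.fft.irfft(cross, n=n)

    # Restrict the search to physically plausible lags if requested.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(np.abs(cc))) - max_shift
    return lag / fs
```

Calling `gcc_phat_tdoa(x, y, fs=16000)` on two equal-rate channels returns the estimated delay in seconds. For a moving speaker, the estimate would typically be recomputed on short overlapping frames so that it can track changes in position over time.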
Similar resources
Adaptive Enhancement of Speech Signals for Robust ASR
Behavior of the least squares filter (LeSF) is analyzed for a class of non-stationary signals that are composed of multiple sinusoids whose frequencies and the amplitudes may vary from block to block and which are embedded in white noise. Analytic expressions for the weights and the output of the LeSF are derived as a function of the block length and the signal SNR computed over the correspondi...
The Munich Feature Enhancement Approach to the 2nd CHiME Challenge Using BLSTM Recurrent Neural Networks
We present a highly efficient, data-based method for monaural feature enhancement targeted at automatic speech recognition (ASR) in reverberant environments with highly non-stationary noise. Our approach is based on bidirectional Long Short-Term Memory recurrent neural networks trained to map noise-corrupted features to clean features. In extensive test runs, enhanced features are evaluated wit...
Design of robust subtractive beamformer for noisy speech recognition
There is great demand for noise reduction to enhance ASR robustness. A great variety of noise reduction methods have been proposed, but almost none of them can reduce non-stationary noise. The authors have proposed an algorithm that can reduce noise that earlier methods, such as adaptive beamformers, find difficult to deal with. In this paper, the authors verify the proposed method as a front-end ...
Speech Enhancement in No Environme
This paper presents a speech enhancement method using noise estimation based on the ratio of the noisy speech to its minimum (NSMR) for non-stationary noise environments. The noise estimator is a very simple but highly effective real-time approach for single-channel noise reduction. The enhanced speech is free of musical tones and reverberation artifacts and sounds very natural compared to methods ...
Nanyang Technological University Model-Based Noise Robust Speech Recognition
Noise robustness is a challenging problem when an automatic speech recognition (ASR) system is deployed in real-life applications. This report examines techniques to improve the robustness of ASR systems. In particular, we focus on a group of model-based noise-robust techniques, called the vector Taylor series (VTS) method, that adapt the acoustic model of ASR systems towards noisy test data using the ...
Publication date: 2013